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The full-length genomes of 11 infectious bronchitis virus (IBV) field isolates from three different types of 
the virus; Massachusetts (Mass), Connecticut (Conn) and California (CAL) isolated over a 41,25 and 8 year 
period respectively, were sequenced and analyzed to determine the mutation rates and level of polymor¬ 
phisms across the genome. Positive selection was not detected and mutation rates ranged from 10~ 4 to 
10~ s substitutions/site/year for Mass and Conn IBV types where attenuated live vaccines are routinely 
used to control the disease. In contrast, for CAL type viruses, for which no vaccine exists, positive selection 
was detected and mutation rates were 10 fold higher ranging from 10~ 2 to 10~ 3 substitutions/site/year. 
Lower levels of genetic diversity among the Mass and Conn viruses as well as sequence similarities with 
vaccine virus genomes suggest that the origin of the Mass and all but one of the Conn viruses was likely 
vaccine virus that had been circulating in the field for an unknown but apparently short period of time. 
The genetic data also identified a recombinant IBV isolate with 7 breakpoints distributed across the entire 
genome suggesting that viruses within the same serotype can have a high degree of genetic variability 
outside of the spike gene. These data are important because inaccurate measures of genetic diversity and 
mutation rates could lead to underestimates of the ability of IBV to change and potentially emerge to 
cause disease. 

© 2011 Elsevier B.V. All rights reserved. 


1. Introduction 

Avian infectious bronchitis virus (IBV) a gamma-coronavirus 
in the Coronaviridae family causes a highly contagious upper- 
respiratory tract disease of domestic chickens characterized by 
coughing, sneezing and tracheal rales. Serotype-specific attenu¬ 
ated live vaccines are routinely used to control the disease, yet IBV 
remains one of the most widely reported respiratory diseases of 
chickens worldwide resulting in significant economic loss due to 
reduction in weight gains and feed efficiency, and condemnations 
at the processing plant. The virus can also cause permanent oviduct 
damage in layers and some strains of the virus cause nephritis, 
which when severe can result in significant mortality. 


Abbreviations: IBV, infectious bronchitis virus; nsps, nonstructural proteins; 
ORF, open reading frame; PBS, phosphate buffered saline; RdRp, RNA-dependent 
RNA-polymerase; RT-PCR, reverse transcriptase-polymerase chain reaction; UTR, 
untranslated region. 

* Corresponding author. Tel.: +1 706 542 5475; fax: +1 706 542 5630. 

E-mail address: mjackwoo@uga.edu (M.W. Jackwood). 

0168-1702/$ - see front matter © 2011 Elsevier B.V. All rights reserved, 
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Infectious bronchitis virus is an enveloped, positive-sense 
single-stranded RNA virus with a genome size of approximately 
27 kb (Holmes, 1991). The 5' two-thirds of the genome, approxi¬ 
mately 21 kb, encodes two large overlapping open reading frames 
(ORFs), la and lab, which encode 15 non-structural proteins 
(nsps), nsps 2-16 (IBV does not have nsp 1) including a papain¬ 
like protease 2 (the PL1 protease is truncated and non-functional 
in IBV), a main protease, the RNA-dependent RNA-polymerase 
(RdRp) and several other non-structural proteins involved in 
making up the replication transcription complex. The remain¬ 
ing one-third of the genome encodes four structural proteins; 
spike (S), envelope (E), membrane (M) and nucleocapsid (N), 
and a number of nonstructural proteins. The spike glycoprotein 
of IBV forms projections on the surface of the virion. Spike is 
post-translationally cleaved into SI and S2 subunits with the SI 
subunit forming the outermost portion and S2 forming a stalk like 
structure that is embed in the viral membrane. The SI subunit 
contains hypervariable regions that play a role in attachment to 
host cell receptors, membrane fusion and entry into the host cell, 
and it contains conformationally-dependent virus-neutralizing and 
serotype-specific epitopes (Cavanagh et al„ 1998; Niesters, 1987). 
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It is extremely difficult to control IBV because little or no cross 
protection occurs between numerous different types of the virus. 
Coronaviruses evolve by a process of natural selection working on 
mutations and recombination events (Domingo and Holland, 1997; 
Moya et al., 2004; Vijaykrishna et al., 2007). These factors contribute 
to diverse subpopulations of the virus that continually emerge to 
form new and variant IBV types. In the absence of a specific vaccine, 
the mutation rate for the hypervariable region in the SI gene of the 
793/B type of IBV was reported as 3 x 10~ 3 substitutions/site/year 
(Cavanagh et al., 1998) while the mutation rate for the GA98 virus, 
which emerged as a result of selection pressure driven by vaccine 
usage, was reported to be 1.5 x 10~ 2 substitutions/site/year in the 
hypervariable region of SI (Lee and Jackwood, 2001 ). 

Sequence alignments and phylogenetic reconstructions indicate 
that recombination events play a significant role in coronavirus 
evolution. Recombination in IBV was demonstrated with co- 
infection of different IBV types in embryonated eggs (Kottier et al., 
1995a). Sequence analysis of the SI gene from field isolates of 
IBV also revealed evidence of multiple recombination events (Jia 
et al., 1995; Wang et al„ 1993) and that recombination occurred 
between vaccine viruses (Mondal and Cardona, 2007). In a recent 
study, recombination in IBV, which replaced the spike gene with an 
unknown sequence, likely from another coronavirus, resulted in the 
emergence of turkey coronavirus (TCoV) a new enteric disease of 
young turkeys (Jackwood et al., 2010). 

Characterization of IBV is based on analysis of the spike gly¬ 
coprotein by either serotyping or genotyping methods (Cavanagh 
and Gelb, 2008). However, pathogenicity in IBV is associated with 
spike as well as genes outside of spike (Ammayappan et al., 2009; 
Cavanagh, 2007; Hodgson et al., 2004). Because typing is focused 
on the analysis of spike, there is no clear understanding of whether 
viruses with similar spike proteins also share high sequence sim¬ 
ilarities in other regions of the genome especially in regions 
associated with pathogenicity. This gap in knowledge results from 
the lack of complete genome sequences from viruses of the same 
serotype. 

The objective of this study was to determine the levels of poly¬ 
morphism across the entire genome of IBV isolates with similar 
spike genes and to examine the mutation rates for viruses with and 
without vaccine selection pressure. We sequenced the full-length 
genome from 11 IBV field isolates of three different serotypes, Mas¬ 
sachusetts (Mass), Connecticut (Conn) and California (CAL) isolated 
over a 41,25 and 8-year period respectively. The genome sequences 
were aligned and phylogenetic analysis was conducted to examine 
genetic relationships between the viruses. Estimates of the rates 
of mutation were calculated and evidence of recombination was 
investigated for each of the full-length genome sequences. To our 
knowledge, this is the first study to identify polymorphisms and 
determine mutation rates using full-length genome sequences of 
IB viruses isolated over a multi-year time span. 


2. Materials and methods 

2.1. Viruses 

In this study, we sequenced Mass viruses isolated in 1965,1972, 
1979 and 1985 obtained from Dr. Jack King (Southeast Poultry 
Research Laboratory, Athens, GA) and one isolated in 2006 obtained 
from Dr. Pedro Villegas (Poultry Diagnostic and Research Labora¬ 
tory, Athens, GA). In addition, Conn viruses isolated in 1966,1972, 
1983 and 1991 obtained from Dr. King, and CAL viruses isolated 
in 1995 and 2003 obtained from Dr. Peter Woolcock (California 
Animal Health and Food Safety Laboratory System, University of 
California, Fresno, CA) were sequenced. The specific viruses were 
all isolated from broiler chickens with clinical signs of disease. The 


viruses were propagated in embryonated eggs as described (Gelb 
and Jackwood, 2008). The strains and embryo passage number are; 
IBV/Mass/65 (passage 1), IBV/Mass/72 (passage 7), IBV/Mass/79 
(passage 2), IBV/Mass/85 (passage 15), IBV/Mass/06 (passage 3), 
IBV/Conn/66 (passage 3), IBV/Conn/72 (passage 7), IBV/Conn/83 
(passage 15), IBV/Conn/91 (passage 1), Cal99/NE15172/95 (passage 
3), CAL/CA557/03 (passage 5). 

2.2. Viral RNA extraction, amplification and genome sequence 
determination 

Viral RNA extraction, reverse transcription and polymerase 
chain reaction (RT-PCR), library construction and sequencing were 
conducted as previously described (Jackwood et al., 2010). Briefly, 
the viruses were filtered through a 0.8 pm filter then through a 
0.22 pm filter (Millipore, Billerica, MA) and viral RNA was purified 
using the High Pure RNA Isolation Kit according to the manu¬ 
facturer’s recommendation (Roche Diagnostic Corporation, Foster 
City, CA). The RT-PCR amplifications were performed with the 
Takara RNA LA PCR kit (Takara Bio Inc., Otsu, Shiga, Japan) using 
a random primer and an amplification primer in a strand displace¬ 
ment amplification reaction following the manufacture’s protocol. 
The sequence of the random reverse transcription primer was 5'- 
AGCGGGGGTTGTCGAATGTTTGANNNNN-3', and the amplification 
primer sequence was 5'-AGCGGGGGTTGTCGAATGTTTGA-3'. Both 
primers were obtained from Integrated DNA Technologies, Inc. 
(IDT, Coralville, IA). A master mix for the RT reaction was pre¬ 
pared, which included MgCl (5mM), 10X RNA PCR buffer (lx), 
dNTP mixture (1 mM), RNase inhibitor (1 units/pl), reverse tran¬ 
scriptase (.25 units/pl), 5' degenerate primer (2.5 p-M), and RNA 
(5.75 p.l/reaction) then 10 pi per sample was aliquoted in a ther¬ 
mocycler tube. The RT reaction conditions were 10 min at 30 °C for 
the primer annealing then 1 h at 50 °C for extension followed by a 
five-minute incubation at 99 °C for inactivation of the enzyme and 
a 5-min period at 5 °C. A PCR master mix, which included at the 
final concentrations: MgCl (2.5 mM), 10X LA PCR Buffer (1 x), ster¬ 
ilized distilled water (32.25 pi), Takara LA Taq (1.25 U/50 pi), and 
5' primer (0.2 pM) was prepared and 10 pi of the RT reaction was 
added to 40 pi of the mix. The amplification reaction consisted of a 
94°C step for 2 min followed by 30 cycles of 94 °C for 30 s, 60 °C for 
30 s, and 72 °C for 3 min. 

Ten PCR reactions were combined for each virus and purified 
using the QIAquick PCR Purification Kit (QIAGEN, Foster City, CA) 
then run on a 1% agarose gel. The PCR products were size selected 
by removing amplicons between 500 and 1500 bp from the gel, fol¬ 
lowed by purification using the QIAquick (QIAGEN) gel purification 
kit. 

The PCR products were cloned into the pCR-XL-TOPO vector 
(Invitrogen, Life Technologies, Carlsbad, CA) according to the man¬ 
ufacturers recommendations and transformed into One Shot TOPO 
Electrocompetent E. coli cells (Invitrogen) using 30 pi of compe¬ 
tent cells mixed with 2 pi of the ligation reaction. Electroporation 
settings were 20kV and 200 using a BioRad Gene Pulser (Bio- 
Rad, Hercules, CA), and the electroporated cells were incubated at 
37 °C in 480 pi of Super Optimal broth medium for 1 h on a rotary 
shaker. The cultures were mixed with 70% glycerol and frozen at 
-80 °C until plated on Q-trays (Genetix, Boston, MA) containing 
liquid broth agar CAT#3002-032 (MP Biomedicals, LLC, Solon, Oh) 
with 50 pg/ml of kanamycin. The Q-trays were pre-warmed at 37 °C 
before the entire culture (approximately 500 pi) was spread on the 
plates and incubated overnight at 37 °C, then robotically picked 
with a Q-BOT (Genetix, Boston, MA). 

Plasmid DNA from the libraries of cloned cDNA fragments for 
each virus were isolated using an alkaline lysis method modified 
for the 96-well format, and incorporating both Hydra and Tomtek 
robots (http://www.intl-pag.org/ll/abstracts/P2c_P116_Xl.html). 
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Cycle sequencing reactions were performed using the BigDye™ 
Terminator® Cycle Sequencing Kit Version 3.1 (Applied Biosystems, 
Foster City, CA) and MJ Research (Watertown, MA) thermocy¬ 
clers. Finished reactions were filtered through Sephadex filter 
plates into Perkin-Elmer MicroAmp Optical 96-well plates. A 1/12- 
strength sequencing reaction on an ABI3730 was used to sequence 
each clone from both the 5' and 3' ends. Each viral genome was 
sequenced to approximately lOx coverage. The accuracy of the 
sequence is insured by generating sequence reads for both strands. 

Gaps and areas with <2x coverage were identified and spe¬ 
cific primers were synthesized (IDT) for RT-PCR amplification and 
sequencing of the ambiguous areas. The RT-PCR was conducted as 
described above, and the reaction conditions were 42 °C for 60 min, 
95 °C for 5 min, then ten 10 cycles of 94°C for 30s, 50°C for 30s, 
68 °C for 90 s, followed by 25 cycles of 94 °C for 30 s, 50 °C for 30 s, 
68 °C for 90 s + 5 s/cycle added. The final elongation step was 68 °C 
for 7 min then the reaction was cooled to 4°C. The PCR products 
were directly sequenced in both directions using the amplifica¬ 
tion primers at a concentration of 15 ng/reaction and the ABI Prism 
BigDye Terminator v3.0 (Applied Biosystems, Foster City, CA). The 
amount of cDNA added to the reaction ranged from 20 to 30 ng and 
the sequencing reactions were analyzed on an ABI 3730 (Applied 
Biosystems). 

The extreme 5' end of each genome was obtained using the 
5' RACE system for rapid amplification of cDNA ends, accord¬ 
ing to manufactures protocol (Invitrogen, Life Technologies, 
Carlsbad, CA). The primer design was based upon an align¬ 
ment of IBV sequences. The RACE primers are designated; IBV 
5'RACE SP1 (5'-CGTATAGAAAAACAAAGCGTCAC-3'), IBV 5'RACE 
SP2 (5'-GTCACTGTCTATTGTATGTCTGCTC-3') and IBV 5'RACE SP3 
(5'-TAGCCGACCTTATGCGAGAACG-3'). The extreme 3' end of 
each genome was obtained using primers designated; M41 3' 
end reverse (5'-GCTCTAACTCTATACTAGCCTA-3') and Genome 3'F 
900 bp (5'-TGACAAGATGAATGAGGAAGGTAT-3'). 

Sequences for some viruses were verified and sequence 
gaps were filled by amplification using overlapping 
primer pairs. Briefly, the primer design tool located at 
http://www2.eur.nl/fgg/kgen/primer/Overlapping_Primers.html 
was used with a template genome sequence of a virus with the 
same serotype as the virus to be sequenced. Thirty-two overlap¬ 
ping primer pairs (available upon request) were synthesized (IDT) 
for each virus and used to amplify the genome. The amplicons were 
sequenced directly using the amplification primers and the ABI 
Prism BigDye Terminator v3.0 (Applied Biosystems) as described 
above. 

2.3. Genome sequence analysis 

The sequence reads for each virus were assembled using the 
SeqMan program (DNASTAR, Inc. Madison, WI) and the ORF 
finder at The National Center for Biotechnology Information 
(NCBI, http://www.ncbi.nlm.nih.gov/gorf/gorf.html) was used to 
predict open reading frames. Genome annotation was entered 
using the annotation editor SeqBuilder (DNASTAR, Inc.). The 
BLASTn program (http://www.ncbi.nlm.nih.gov/BLAST/) was used 
to search GenBank (National Center for Biotechnology Informa¬ 
tion, http://www.ncbi.nlm.nih.gov/) for similar IBV sequences. The 
MegAlign program implemented in DNASTAR was used to align 
sequences and to calculate percent identity and number of substi¬ 
tutions between selected sequences. 

2.4. Multiple sequence alignments and phylogenetic analysis 

Whole genome alignments were generated using ClustalW and 
phylogenetic trees were constructed with the Neighbor-Joining 


method, Minimum Evolution method, Maximum Parsimony 
method, and UPGMA with 1000 bootstrap replicates (MEGA4, 
http://www.megasoftware.net/index.html (Tamura et al., 2007)). 
To reconstruct evolutionary trees, Maximum Likelihood (Tree- 
Puzzle, http://www.tree-puzzle.de/) and Bayesian analysis (BEAST 

1.4, http://beast.bio.ed.ac.uk/) were conducted. 

2.5. Molecular evolutionary rates and recombination detection 

A codon-based test of positive selection (Z-test, MEGA4) 
was used to analyze the numbers of non-synonymous and 
synonymous substitutions per site (dN/dS ratio) in the spike 
glycoprotein. Because only 2 CAL type viruses were sequenced 
herein, additional reference sequences for the CAL type viruses 
were included in the codon-based test (Z-test) of positive selec¬ 
tion; CAV/CAV56b/91 (AF027509), CAV/CAV9437/95 (AF027510), 
CAL99/CAL99/99 (AY514485) and CA/CA1737/04 (EU925393). 

To determine mutation rates, the full length genomes as 
well as the coding sequences for ORFlab, SI, E, M and N gene 
from each virus within the same virus type were aligned using 
CLUSTALX 2.0.8 (Larkin et al., 2007) and manually edited in 
Jalview (Waterhouse et al., 2009). The Bayesian Markov chain 
Monte Carlo (MCMC) implemented in BEAST vl.4.8 (Drummond 
and Rambaut, 2007) was used to estimate mutation rates of the 
aligned sequences and the sampling dates for each isolate were 
used as calibration points. The mutation rates were analyzed using 
the general time-reversible (GTR) (Rodriguez et al., 1990) substi¬ 
tution model with gamma-distributed among site rate variation 
with four gamma rate categories (Yang, 1994). The SRD06 model 
was used to partition codon positions and an uncorrelated log¬ 
normal relaxed clock model was selected. Each MCMC chain was 
run for 10 million states and sampled at every 1000 states. Tracer 
vl.4.1 (http://tree.bio.ed.ac.uk/software/tracer/) was used to con¬ 
firm convergence of MCMC chains with 10% of each chain discarded 
as burn-in. 

Complete genome nucleotide sequences for Cal99/NE15172/95, 
IBV/CK/CA99/99, CA/CA557/03, lBV/Conn/66, IBV/Conn/72, 

IBV/Conn/83, IBV/Conn/91, IBV/Mass/65, IBV/Mass/72, 
IBV/Mass/79, IBV/Mass/85, IBV/Mass/06 and the IBV/CK/A2 
China strain (used as out-group) were aligned using CLUSTALX 
2.0.8 (Larkin et al., 2007). To identify recombinants as well as major 
and minor parents, the data set was scanned using a Recombina¬ 
tion Detection Program (RDP) v2 with implemented algorithms 
GENECONV, BootScan, MaxChi, Chimera and SiScan. Similarity plot 
and bootscan analyses were performed using the SimPlot program 
(Lole et al., 1999) to further identify recombination events and 
recombination breakpoints. The window width and step size were 
set to 1000 bp and 40 bp, respectively. 

3. Results 

3.1. Nucleotide sequence accession numbers 

Genome sequences generated in this study were sub¬ 
mitted to the GenBank database and assigned the follow¬ 
ing accession numbers; IBV/Mass/65 (FJ904720), IBV/Mass/72 
(FJ904721), IBV/Mass/79 (FJ904722), IBV/Mass/85 (FJ904723), 
IBV/Mass/06 (FJ904713), IBV/Conn/66 (FJ904716), IBV/Conn/72 
(FJ904717), IBV/Conn/83 (FJ904718), IBV/Conn/91 (FJ904719), 
Cal99/NE15172/95 (FJ904714), CA/CA557/03 (FJ904715). 

GenBank accession numbers for reference genome 
sequences used in this study are: TCoV/VA-74/03 
(GQ427173); TCoV/TX-GL/01 (GQ427174); TCoV/TX-1038/98 
(GQ427176); TCoV/IN-517/94 (GQ427175); TCoV/IN-540/94 

(EU022525); TCoV/MN-ATCC (EU022526); TCoV/Canada-MGlO 
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Table 1 

Codon-based test (Z-test) of positive selection for SI gene sequences. 
Isolate P value 3 


CAL99/NE15172/95 CAL99/CAL99/99 CAL/CA557/03 CA/CA1737/04 CAV/CAV56b/91 CAV/CAV9437/95 


CAL99/NE15172/95 

CAL99/CAL99/99 

CAL/CA557/03 

CA/CA1737/04 

CAV/CAV56b/91 

CAV/CAV9437/95 

0.00 

0.00 

0.00 

0.00 

0.00 

IBV/Conn/66 

1.00 

0.01 

0.00 

0.00 


0.00 

0.00 

0.00 

IBV/Conn/72 

0.00 

0.00 

1.00 

IBV/Conn/83 

IBV/Conn/91 

IBV/Conn/66 








IBV/Conn/72 

0.00 







IBV/Conn/83 

0.00 



0.15 




IBV/Conn/91 

0.00 



0.15 


0.07 



IBV/Mass/65 


IBV/Mass/72 


IBV/Mass/79 

IBV/Mass/85 

IBV/Mass/06 

IBV/Mass/65 








IBV/Mass/72 

0.17 







IBV/Mass/79 

0.11 


0.09 





IBV/Mass/85 

0.16 


0.08 


0.08 



IBV/Mass/06 

1.00 


1.00 


1.00 

1.00 



3 P values <0.05 (boldface) indicates that the null hypothesis of strict-neutrality (dN=dS) is rejected in favor of the alternative hypothesis of dN> dS indicating positive 
selection. 


(EU095850); IBV/Mass 41/41 (AY851295); IBV/Ark/Arkpl 1 

(EU418976); IBV/Ark/Arkpl01 (EU418975); IBV/CK/Beaudette 
(NC001451); IBV/CK/H120 (FJ888351); IBV/CAL99/CAL99/99 
(AY514485); IBV/CK/CH/LSD/051 (EU637854); IBV/CK/SAIBK 
(DQ288927); IBV/CK/BJ (AY319651); PartridgeCoV/GD/S14/03 
(AY646283); PeafowlCoV/GD/KQ.6/03 (AY641576); IBV/CK/A2 
(EU526388); Beluga Whale/CoV/SWl/08 (NC010646); Thrush- 
CoV/HKU12/09 (FJ376621); MuniaCoV/HKU13/09 (FJ376622); 
Bulbul/CoV/HKU11/09 (FJ376619). 


3.2. Full-length genome sequence and organization 

The full-length genome sequence of 11 IBV isolates from 
the field obtained over a 41 year period (5 Mass serotypes), 
25 year period (4 Conn serotypes) and 8-year period (2 CA 
serotypes) were sequenced and the virus genome lengths are; 
IBV/Mass/65, 27, 474 nt; IBV/Mass/72, 27, 474 nt; IBV/Mass/79, 
27, 614 nt; IBV/Mass/85, 27, 474 nt; IBV/Mass/06, 27, 660 nt, 
lBV/Conn/66, 27, 629 nt; lBV/Conn/72, 27, 632 nt; IBV/Conn/83, 
27, 632 nt, IBV/Conn/91, 27, 632 nt; Cal99/NE15172/95, 27, 664 nt; 
CA/CA557/03, 27, 615 nt (Supplementary Table 1). Genomic 
libraries and RT-PCR fragments from overlapping specific primer 
pairs were used to assemble genomic sequences with SeqMan 
Pro and genome annotation was performed using SeqBuilder 
(DNASTAR, Inc., v.8.0.2, Madison, WI), to reveal a typical gamma 
coronavirus gene order for all of the viruses; 5' UTR, ORF1 a/ab, spike 
glycoprotein, ORF3a, ORF3b, envelope protein, membrane protein, 
ORF 4b, ORF 5a, ORF 5b, nucleocapsid protein, ORF 6b, 3' UTR and 
a poly(A) tail. 

The full-length genome sequences were aligned using ClustalW 
and all the phylogenetic trees constructed with the Neighbor- 
Joining method, Minimum Evolution method, Maximum Par¬ 
simony method, and UPGMA using MEGA 4.0.2 had similar 
topography (Tamura et al., 2007). A representative tree with the 11 
viral genomes sequenced herein and other IBV full-length genomes 
available in GenBank is presented in Fig. l.The nucleotide sequence 
similarities between the genomes of the viruses sequenced herein 
are from 93.8% to 94.9% for the CAE isolates, 99.8% to 99.9% between 
the Conn isolates and 92.4 to 100% between the Mass isolates. 


3.3. Phylogenic analysis of untranslated regions (UTRs) and 
individual ORFs 

To determine the relationship between different virus types for 
sequences in specific areas across the genome, phylogenetic analy¬ 
sis of the nucleotide sequences of the 5' and 3' UTRs and the amino 
acid sequences of individual ORFs was conducted ( Fig. 2 ). The spike 
glycoprotein, which is used to define IBV isolate type, shows indi¬ 
vidual groupings for Conn, Mass and CAL viruses with the exception 
of the IBV/Mass/06 virus, which is positioned as an offshoot of 
the Conn group. Similar groupings were observed for the enve¬ 
lope and nsp 3 (PL2) protein trees, whereas IBV/Mass/06 clearly 
grouped with Conn type viruses in the membrane, nucleocapsid, 
nsp 5 (Mpro) and nsp 11/12 (RdRp) protein trees (Fig. 2). A BLASTp 
search showed the IBV/Mass/06 virus spike glycoprotein to have 
98% similarity to HI20 (Acc. No. ACQ55230), which is a commonly 
used Mass type vaccine. A BLASTp search and phylogenetic analysis 
of the other Mass spike glycoprotiens identified Mass type atten¬ 
uated viruses (Acc. Nos. EU283075, EU283081, EU283082), which 
grouped together (data not shown). A BLASTp search and phylo¬ 
genetic analysis of the Conn spike glycoproteins identified Conn 
attenuated viruses (Acc. Nos. EU283059, EU283061 and EU283062), 
which also grouped together (data not shown). 

The phylogenetic analysis of the 5' UTR showed individ¬ 
ual groupings for Conn, Mass, and CAL viruses, however the 
IBV/Mass/06 virus grouped with the CAL viruses (Fig. 2). In the 
3' UTR, the CAL/CA557/03 is an offshoot of the Mass group, 
the IBV/Mass/06 virus is an offshoot of the Conn group, and 
Cal99/NE15172/95 was clearly distinct from the other viruses. A 
BLAST analysis of the Cal99/NE15172/95 virus 3' UTR showed that 
it was 99% identical to the DE072 strain of IBV (Acc. No. AF203002), 
a Delaware type virus. Although the 3' UTRs sequence divergence 
was as high as 13.5% among the viruses, sequence alignment 
of the stem-loop (s2m) region in the 3' UTR revealed identical 
sequences among Cal99/NE15172/95, IBV/Conn/66, IBV/Conn/72, 
IBV/Conn/83, and IBV/Conn/91, one nucleotide difference (nt. posi¬ 
tion 9: A to G) in the CAL/CA557/03 and IBV/Mass/79 viruses, and 
two nucleotide differences (nt. positions 9: A to G and 26: A to 
C) in the IBV/Mass/65, IBV/Mass/72 and IBV/Mass/85 viruses while 
IBV/Mass/06 had an major A and a minor G at nt. position 36 (data 
not shown). 
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Fig. 1 . Infectious bronchitis virus full-length genomic sequences were aligned with ClustalW and the phylogenetic trees were constructed with the Neighbor-Joining method. 
The optimal tree with the sum of branch lengths is shown. The tree is drawn to scale, with branch lengths in the same units as those of the evolutionary distances. A bootstrap 
consensus tree was constructed from 1000 replicates (percentage of replicate trees in which associated strains clustered together are presented at nodes). The p-distance 
scale is presented at the bottom of the figure. The analyses were constructed using MEGA4 (Tamura et al., 2007). 


3.4. Recombination events detected in the Mass06 isolate 

To further examine the relationship of the IBV/Mass/06 virus 
with the other viruses examined in this study as well as selected 
IBV genomes available in GenBank we conducted a Recombina¬ 
tion Detection Program (RDP) analysis, which indicated that the 
IBV/Mass/06 isolate was a recombinant with Conn and CAL viruses 
as the potential major parents (data not shown). The recombina¬ 
tion events in IBV/Mass/06 were identified using Simplot, which 
showed that the IBV/Mass/06 genome shared a high sequence sim¬ 
ilarity, 97-100%, with CAL and Conn isolates in different regions of 
ORFlab, with Mass in the SI gene and with Conn isolates in the 
remainder of the 3' end of the genome (Fig. 3A). At the extreme 
3' end of ORFlab, between nucleotides 15,500 and 19,800, the 
IBV/Mass/06 virus did not show any significant sequence simi¬ 
larity with any of the genomes in this study. A BLAST analysis 
of that region showed 99% similarity with IBV strain HI20 (Acc. 
No. FJ807652). The Simplot bootscan analysis revealed a total 


of seven breakpoints located at approximate nucleotide posi¬ 
tions 1530, 5895, 8188, 11,200, 12,266, 19,918 and 21,975 in the 
IBV/Mass/06 virus (Fig. 3B). The sequences located between the 
breakpoints were designated regions I—VIII, respectively. Regions 
one through five were located in nsp2, nsp3, nsp4, nsp9 and 
nspll/12 respectively. Region VII contained the entire SI glyco¬ 
protein gene, whereas the S2 glycoprotein gene was in region VIII. 

Evidence of recombination in the IBV/Mass/06 isolate was 
further confirmed when phylogenetic trees constructed from 
the regions delineated by the breakpoints showed incongruence 
(Fig. 4). The IBV/Mass/06 isolate grouped with Mass type viruses 
in region VII, which contained the SI gene. However, IBV/Mass/06 
grouped with CAL type viruses in regions I, III and V, and with Conn 
type viruses in regions II, IV and VIII. In region VI, IBV/Mass/06 did 
not group with any of the genomes included in this study. As indi¬ 
cated above, a BLAST search of the IBV/Mass/06 virus sequence in 
that region showed that it was most closely related to the HI20 
Mass type vaccine virus. 
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Fig. 2. Infectious bronchitis virus amino acid sequences A through J as indicated in the figure, were aligned with ClustalW and the phylogenetic trees were constructed using 
the Neighbor-Joining method. The optimal trees with the sum of branch lengths are shown. The trees are drawn to scale, with branch lengths in the same units as those of the 
evolutionary distances (the p-distance scale is presented at the bottom of each tree). Bootstrap consensus trees were constructed from 1000 replicates and the percentage 
of replicate trees in which associated strains clustered together are presented at the nodes. The analyses were conducted using MEGA4 (Tamura et al., 2007). 
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Fig. 3. Simplot analysis to detect recombination and estimate breakpoint within the lBV/Mass/06 genome. (A) Relative gene location and similarity plot using IBV/Mass/06 
as the query sequence. (B) Bootscan analysis showing multiple breakpoints designated as regions I—VIII. Complete genome nucleotide sequences used in the analysis were 
Cal = Cal99/NE15172/95,1BV/CK/CA99/99, CA/CA557/03, Conn = IBV/Conn/66, IBV/Conn/72, lBV/Conn/83, IBV/Conn/91, and Mass = lBV/Mass/65, IBV/Mass/72, IBV/Mass/79, 
lBV/Mass/85. The IBV/CK/A2 China strain was used in the analysis as an out-group. Both analyses were performed with the F84 distance model, window size of 1000 bp and 
step size of 40 bp. 


3.5. Mutation rates 

A codon-based test of positive selection (Z-test, MEGA4), which 
measures the numbers of non-synonymous and synonymous sub¬ 
stitutions per site (dN/dS ratio) showed that positive selection was 
occurring in the CAL viruses and between IBV/Conn/66 and other 
Conn viruses but not in Mass type viruses (Table 1 ). The BEAST 
vl.4.1 software, used to estimate the mutation rates for individual 
coding regions within each virus type, indicated that the CAL iso¬ 
lates had mutation rates that were 2-4 orders of magnitude higher 
than the Mass and Conn isolates (data not shown). Data for the full- 
length genomes and the SI glycoprotein are presented in Table 2. 
The mutation rate for the ORFlab, SI and N gene was calculated to 
be 10~ 3 substitutions/site/year and 10 -2 for the E and M genes from 
the CAL viruses, whereas the Conn and Mass viruses had mutation 
rates that were 10 fold lower and ranged from 10~ 4 to 10~ 6 in all of 
the coding regions except for the E gene in the Conn viruses, which 
did not show any polymorphisms. 

4. Discussion 

The majority of emerging infectious diseases are caused by 
RNA viruses. Presumably, high rates of mutation and rapid repli¬ 
cation drive their evolution. Infectious bronchitis virus, like most 
RNA viruses, undergoes a high rate of genetic change that con¬ 
tributes to diverse subpopulations of the virus that continually 
emerge to form new variants of existing strains as well as com¬ 
pletely new types of the virus. New variants and types of 1BV are 


defined by the sequence of the spike glycoprotein and/or by spe¬ 
cific antibodies against spike, and for many 1BV types, only spike 
sequences are available. To examine sequence changes not only in 
spike but also over the entire length of the genome, we sequenced 
the full-length genomes of 5 Mass type, 4 Conn type and 2 CAL 
type 1BV field viruses that were isolated over a 41, 25 and 8 year 
period respectively. All of the viruses were obtained from chick¬ 
ens with clinical signs of the disease. Alignments of the full-length 
genome sequences showed that the Mass viruses had >92.4% iden¬ 
tity (>93.7% if the recombined IBV/Mass/06 is not included), the CAL 
viruses had >93.8% identity and the Conn viruses varied the least, at 
>99.8% identity. Examination of the mutation rates for the ORFlab, 
SI, E, M and N genes showed, in general, that the CAL viruses had 
rates between 10~ 4 and 10~ 2 substitutions/site/year, whereas the 
Conn and Mass viruses had lower mutation rates that ranged from 
10~ 4 to 10~ 6 substitutions/site/year. In addition, the E gene in the 
Conn viruses showed no signs of change. Although the egg pas¬ 
sage level of all the viruses used in this study was relatively low 
(<15 passages) it is possible some genetic changes could have been 
introduced as a result of propagation in embryonated eggs. How¬ 
ever phenotypic change is likely to be minimal since it typically 
takes 75 or more egg passages to attenuate IBV and in a study that 
examined genetic changes in the highly variable SI gene following 
10 passages in embryonated eggs, five viruses had no changes, two 
viruses had only 2 nucleotide changes one virus had only 1 change 
(McKinley et al., 2008). 

It has been shown that vaccine usage can result in faster evo¬ 
lutionary rates in IBV field viruses (Lee andjackwood, 2001; Wang 
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Fig. 4. The nucleotide sequences A through H as indicated in the figure, were aligned with ClustalW and the phylogenetic trees were constructed using the Neighbor-Joining 
method (Saitou and Nei, 1987). The optimal trees with the sum of branch lengths are shown. The trees are drawn to scale, with branch lengths in the same units as those of 
the evolutionary distances (the p-distance scale is presented at the bottom of each tree). Bootstrap consensus trees were constructed from 1000 replicates and the percentage 
of replicate trees in which associated strains clustered together are presented at the nodes. The analyses were conducted using MEGA4 (Tamura et al., 2007). 
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Table 2 

Mutation and evolutionary rates of the genome and SI genes for California (CAL), Connecticut (Conn) and Massachusetts (Mass) isolates. 


Virus type 

Mutation rate (nucleotide changes/year) 

Evolutionary rate (amino acid changes/year) 

Genome 3 

SI gene b 

Genome 3 

SI gene b 

CAL 

1.0 X to- 2 

9.4 x 10- 4 

NJ 

o 

X 

o 

2.4 x 10- 3 

Conn 

1.5 xlO- 4 

1.7 x 10- 4 

2.6 xlO- 4 

3.5 x 10- 4 

Mass 

1.9 x 10- 3 

2.4 x 10- 4 

3.3 x 10- 3 

5.8 x 10- 4 


a Viruses used in the genome analysis were the CAL, Conn and Mass viruses shown 
b Viruses used in the SI gene analysis are the same viruses shown in Table 1. 

et al„ 1993). Thus, it was surprising that mutation rates for Mass 
and Conn type viruses were lower than mutation rates for CAL type 
viruses because live attenuated vaccines have been used against the 
Mass and Conn type viruses in poultry for many years, whereas no 
vaccine exists for the CAL type viruses. One possible explanation for 
the differences in mutation rates is that the Mass and Conn viruses 
were possibly re-isolated vaccine viruses that have been circulating 
in the field for an unknown but relatively short period of time, with 
the exception of the lBV/Conn/66 virus, which did show positive 
selection when compared to the other Conn viruses. In contrast, 
because no vaccine exists for CAL viruses, those viruses represent 
true field isolates that have been circulating for known periods of 
time. 

The possibility that the origin of the Mass and Conn isolated 
viruses is vaccine, is supported by lower levels of genetic diversity 
observed for many of the genes within each virus type (Fig. 2) as 
well as the genetic similarity of the viruses, previously character¬ 
ized as attenuated viruses. It has been shown that IBV can be shed 
for long periods of time and that vaccine viruses can persist in the 
field and revert to cause disease (Farsang et al„ 2002; Jackwood 
et al„ 2009; Matthijs et al„ 2008; Naqi et al., 2003). Thus, it is possi¬ 
ble that vaccine viruses are more likely to be isolated from the field 
because they may be more abundant having displaced field viruses 
of the same type or because they are more easily isolated since they 
are adapted to grow in embryonated eggs. Because the Mass and 
Conn type viruses were isolated from clinically diseased chickens 
suggests that if they were originally vaccine viruses then they have 
likely acquired the sequence changes necessary for pathogenicity. 
Further studies would need to be conducted on vaccine viruses and 
pathogenic strains to gain a better understanding of the molecular 
changes necessary for coronavirus reversion to pathogenicity. 

Another explanation for lower mutation rates in Mass and Conn 
viruses could be that some of the mutations observed are inciden¬ 
tal, meaning that they are not deleterious and provide no selective 
advantage to the virus rather than due to selection pressure. This 
possibility was suggested in another study where no evolution was 
detected in the 5' hypervariable region of the SI gene for a num¬ 
ber of 793/B type viruses isolated over all year period (Cavanagh 
et al., 1998). This however does not explain why polymorphisms 
were observed for the CAL type viruses. 

Recombination contributes to the genetic diversity of coro- 
naviruses and can lead to the emergence of new viruses and 
outbreaks of new diseases (Holmes, 2009; Jia et al., 1995; Woo et al., 
2009). Although the lBV/Mass/06 virus is identified as a Mass type 
virus based on characterization of the spike glycoprotein, it was 
shown herein to be a recombinant virus with Conn and CAL viruses 
as major parents. Eight regions defined by 7 breakpoints distributed 
across the entire genome were identified. Region VII was simi¬ 
lar to other Mass type viruses, which was expected because that 
region includes the entire SI gene. However, CAL like sequences 
were found in regions I, III and V, Conn sequences were found 
in regions II, IV and VIII, and region VI was found be similar to 
the HI20 Mass type vaccine strain. High frequencies of recombi¬ 
nation including recombination events between vaccine and field 
strains have been reported for IBV (Brooks et al., 2004; Cavanagh 


in Fig. 1. 


et al., 1992; Fang et al., 2005; Jackwood et al., 2010; Kottier et al„ 
1995b; Lee and Jackwood, 2000; Yu et al„ 2001). In addition, IBV 
pathogenicity was shown to be polygenic, involving spike as well as 
replicase proteins (Ammayappan et al., 2009; Armesto et al., 2009; 
Cavanagh, 2007; Hodgson et al., 2004). Although IBV type is defined 
by the spike glycoprotein, our data shows that viruses within the 
same type can have vastly divergent sequences in other regions of 
their genome including regions that contain genes important for 
pathogenicity. If recombination occurs outside of the spike gene 
where a pathogenic field virus acquires attenuated vaccine virus 
sequences, the outcome would likely be a relatively benign virus 
regardless of the virus types (spike sequences) involved. Further¬ 
more, if a vaccine virus circulating in a flock recombines with a 
pathogenic virus outside of the spike gene, resulting in the vaccine 
virus becoming pathogenic, the outcome is still likely to be of little 
or no consequence since the birds would presumably be immune to 
the unchanging spike. However, in the event that a pathogenic field 
virus recombines with either a vaccine virus or another pathogenic 
virus and acquires a new or unique spike glycoprotein gene, the 
outcome could be the emergence of a new virus capable of caus¬ 
ing disease. This was recently shown for TCoV (Jackwood et al„ 
2010). Although new coronaviruses can emerge and cause disease 
as a result of recombination it appears that the emergence of most 
variant or new IBV types is due to accumulation of mutations in 
the spike glycoprotein over time (Cavanagh et al., 2005; Jackwood 
et al., 2005, 2007; Lee and Jackwood, 2001; Nix et al., 2000). Thus, 
until pathogenicity genes can be specifically identified, monitoring 
the spike gene in IBV appears to be the most reliable measure of 
genetic change leading to the emergence of new viruses capable of 
causing disease. 

This is the first study to identify genetic changes and muta¬ 
tion rates using full-length genome sequences of IBVs isolated over 
many years. Our data show that mutation rates are lower and that 
positive selection was not occurring for IBV types where atten¬ 
uated live vaccines are used compared to an IBV type were no 
vaccine exists, suggesting that the virus isolates examined were 
re-isolated vaccine viruses that have been circulating in the field 
for a relatively short-period of time. This observation is extremely 
important because it could lead to incorrect estimates of mutation 
rates for RNA viruses where modified live vaccines are used to con¬ 
trol the disease. In this study, we also identified a recombinant IBV 
isolate with two major parents and no fewer than 7 breakpoints dis¬ 
tributed across the entire genome. Although recombination among 
coronaviruses can contribute to genetic diversity of the virus and 
recombination has been shown to lead to the emergence of new 
viruses, it appears that monitoring mutations in the spike gene that 
accumulate over time is currently the best method of identifying 
potentially new genetic variants and IBV types capable of causing 
disease. 
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Appendix A. Supplementary data 

Supplementary data associated with this article can be found, in 
the online version, at doi: 10.1016/j.virusres.2011.04.006. 
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